{
  "cells": [
    {
      "cell_type": "markdown",
      "id": "23d69f99",
      "metadata": {
        "id": "23d69f99"
      },
      "source": [
        "# Pixtral (12B) Vision Model"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "b842f675",
      "metadata": {
        "id": "b842f675"
      },
      "source": [
        "## Description\n",
        "This notebook demonstrates how to use the **Pixtral (12B) Vision Model** vision-language model for image input and text generation tasks."
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DhivyaBharathy-web/PraisonAI/blob/main/examples/cookbooks/Pixtral_12B_Vision_Model.ipynb)\n"
      ],
      "metadata": {
        "id": "MilYelUMlD2z"
      },
      "id": "MilYelUMlD2z"
    },
    {
      "cell_type": "markdown",
      "id": "557dba1c",
      "metadata": {
        "id": "557dba1c"
      },
      "source": [
        "## Dependencies\n",
        "```python\n",
        "!pip install transformers accelerate torch torchvision\n",
        "```"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "2b864df8",
      "metadata": {
        "id": "2b864df8"
      },
      "source": [
        "## Tools\n",
        "- 🤗 Transformers\n",
        "- PyTorch\n",
        "- Vision Model APIs"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "1b050001",
      "metadata": {
        "id": "1b050001"
      },
      "source": [
        "## YAML Prompt\n",
        "```yaml\n",
        "mode: vision\n",
        "model: Pixtral (12B) Vision Model\n",
        "tasks:\n",
        "  - image captioning\n",
        "  - visual question answering\n",
        "```"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "44ce2c02",
      "metadata": {
        "id": "44ce2c02"
      },
      "source": [
        "## Main\n",
        "Below is the simplified main execution to load the model and run basic inference."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "c996306e",
      "metadata": {
        "id": "c996306e"
      },
      "outputs": [],
      "source": [
        "%%capture\n",
        "import os\n",
        "if \"COLAB_\" not in \"\".join(os.environ.keys()):\n",
        "    !pip install unsloth\n",
        "else:\n",
        "    # Do this only in Colab notebooks! Otherwise use pip install unsloth\n",
        "    !pip install --no-deps bitsandbytes accelerate xformers==0.0.29.post3 peft trl triton cut_cross_entropy unsloth_zoo\n",
        "    !pip install sentencepiece protobuf \"datasets>=3.4.1\" huggingface_hub hf_transfer\n",
        "    !pip install --no-deps unsloth"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "30147f7c",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 543,
          "referenced_widgets": [
            "8a0ce0af1025429cb3cfcc5b973a4af0",
            "6fe6ac6a53f745c5a2736e1b6cb01ff7",
            "8d93c4b87f69485e8e97192a8431113f",
            "08af812d04fe481fa84e85a6a591c506",
            "257fe2a05570417882a606b304087a21",
            "6d8e442e771d4955a29a360dee525119",
            "00a911ef0d7f4a6d8b1b4f972b0b4d7c",
            "766a605b2c5a463d977b8fefb554fea5",
            "54ae26be14bc4fb28933af46f45e023a",
            "360f3db84aac4728861b74b5a37adcaa",
            "a2ae977cf14c461e9fa0cbc5c27e7186",
            "d401dcde88274752b4f0786f834527d9",
            "5d10ef16c7154f5392519da3ff6a2f0d",
            "d8358c67c0b0415098b3e4a4e7ad0cbb",
            "4f593d75532d4087bafc02d6ad2b4f79",
            "a519350329d04d5096ccb605c9ffd623",
            "9cd924df198145ed93d9df07249236cb",
            "a01af94f415b4ed18186fa099c3ecbc3",
            "27cd2e02e78d4ef8a52ba73a9e214120",
            "841f197cba3f455d8fe9a9dafac992d7",
            "d6507d692f9743e6b3eed08df94e2a56",
            "c3438845441142fa92fec3acd21b3955",
            "eb64d7acd375493098e1f7c30e7f2781",
            "ad01bece3f444f1382988a65833bebf7",
            "281b9b2311e5460aa4df9bbe9f8e22f0",
            "34afc68bfe9d4e47ad73b31a34a74b32",
            "23c9c94b98f141e8827c37cd26b6138e",
            "d2b5a8a0c51c4d04aa1da73a56c2bcd3",
            "4587ddab1af04157b8cba09f780ce9b0",
            "aab82ec73c0b42fb928fbc74fc44c168",
            "45f90f73051b4f898666ecd14f2818f0",
            "6f96f236961344418015be410aedc57c",
            "6105cefd94bb4a07b9123e461e857c9c",
            "d4d3c9f1f54746d8895b66b6d2888ee2",
            "eaa7a98acf704791a5c3ca37232c94b9",
            "7b1f0430e15a476eb4b59fd5ddb0f9ac",
            "481ce6bb0bf74e9592b5a4ef6cbc6d33",
            "7a35df90dc40472eb0862deb368da362",
            "d4bda244da214ee497b82608944bce6e",
            "2cd8ad1cedb0484fa4c5d47b7b25971c",
            "68c0b7aee1464e78b1bbd420f540fcf4",
            "1fd521ec517348aa9d8dd73757619da4",
            "9c967803321d4ca39b663165c2ce1953",
            "c39f8980d63a4a5ab17f031e06b03f67",
            "b60b17cb67044a82b173f94d781d5174",
            "9078d37b65424033b551caf1ae410d2f",
            "e5d3e2d9fb7e4f8fbfa84ba66ea0a81c",
            "ae50ac57cecb48cb917a2a3ddf3312d0",
            "2a85749ab4144e01bb640cd90a1a8697",
            "96ce0742a2a247dd84d85ad7b5286f19",
            "b67a770e477e40b3a7ca8f4570cb2129",
            "79dfa298db314ce2874e86abd8e6b1ba",
            "38700d9515ca4ee694010fd7914e0d18",
            "7819396bd89b4917beac22678b04b9ec",
            "bdbd0b350dbd49e9b463993a3d5c0e92",
            "8f1faff5468a46d4a725d095ae31ea3c",
            "4a704f4b14614fe6934420a895d9d1a0",
            "e67dfbfbba4d4f4298b7ef302830ab40",
            "9b6572529f65495693d840718d37e744",
            "f06f07a6a1b94dfa9091651c9e95f6f4",
            "c3bea8260e5f493d8a470ff98f025324",
            "60f44f8bb72645eba40d9320d9d6e447",
            "47905056428d4404b73049b3cc437a5a",
            "5edec95909a6434ebf0bfd8a60f5887e",
            "c9900bd2f3374f858680f5a40efa70a0",
            "003f8dc6323b40d29b4f827e14108b20",
            "1ef7d525ad2b4a57ae1118a0c3908b1a",
            "7630787d9003423c81eac646fb88edc6",
            "f41ca30bce18404b953b97548251b556",
            "371f6292472d4befb3d8fcfcca290fed",
            "4334bb17edbd4b279287ea9723a62a3c",
            "b3114bb6194149989d1fa63011299251",
            "4827624c03d848bbb473fe94b2d21c22",
            "3047608c78ef465199d52959a425c5c0",
            "8fb84b1203cb43b5803c11b4a77c0d42",
            "426281caaa3f47e393ea8b001123cb22",
            "6381166be9444c0a9b98186e978e94cb",
            "fe28321b0bf04f58b0d63e7dc617e3ed",
            "77335dfe86c34eb08f9085b6b839b8c7",
            "49e7c650e0cc4b449ddd2d50f989d541",
            "c502fd2bf724410dbcf1528c79f220ce",
            "1b8d5199f7144981b5e9ae71f8ea8199",
            "2fa87811a24f4b6e9695d7b80817a422",
            "25133a67be974e6293bcdb3507529406",
            "fdbff4f0bd6b4989b72f8252b5faea16",
            "189edd39642b421f879956bea51d853e",
            "740ffbd464f041a283b9b2d9396b180c",
            "f5661357cbc144c5947529a979360114",
            "ec91a47eece543a495f36ed0e051d637",
            "443e64f8935544a0b4c159794fd8b272",
            "7dd0095387264941a52e8587ebd0fea1",
            "363406f847db494f972011724185b02a",
            "c21d3532550248b8a927756a510a2770",
            "82f03edadab54f949f92dd0c4a60bd55",
            "d2488adbea5344329da2b447382139f5",
            "fed393eac57747fabee1c2b743828ef8",
            "b1d82fdb948b4c5bbee4d327a0f41a21",
            "cccdb395613a4256950b7c3d740736fd",
            "24c7bdd72f3546c882a7a7c78de5925f",
            "db8a5ea365094bc4a87e77d059b4d0d0",
            "2f494500c50a42f0ba3589743048cdd8",
            "3ec4018cd5e54ba998a0f143f62407cc",
            "f3a87c6d2986429cb40979782b38e712",
            "4f7fb67e816b44598022407aacf71f19",
            "be9935d016a34ac5a272b51261053f13",
            "24540fc3ff1944edaa61e6037bec31c8",
            "713338a99be84b54877f8e35c953d8ba",
            "5586a425c639474ba5d6c0bd6d93328a",
            "088d87f186cc4a5d95745d8317cca20e",
            "acdecdef8bd14e44bcd36374b944b562",
            "1c7c3bfb5ae4472c9bb6e340c012d302",
            "41ae4c0601084403ba26a287f485ebeb",
            "916e61f972f6439b8135a70100753917",
            "36c3f462830247d7a219f1722cae8791",
            "69ae06a22fa9488598ed89641cc135b6",
            "c5ca864cc0144ddeb72dcbf1f8303609",
            "728f786930a14c7a91cb4ff1f97a24ee",
            "71704d8853ed40c4a55fa06669393ec9",
            "af5c196783e349888eaefc227beadbf0",
            "d3aee68f69ad4e5d81a087ec791cc9b1",
            "1ec887eb8dac43ba9922e1f9f2111231",
            "aa6b598305944d5ab2e2b4afaea8063c",
            "a132d68b4ae94f64b80893b9149f7fd8",
            "ec8f420a6f7d413bb528a1661aa3a0e8",
            "b70265a81e9a49109c18eaad44135562",
            "67a1b2e8e39b48849441368c33e16be4",
            "e2ec06c317034b5681d83347b530362f",
            "97bcd619eea74ed99650a1d8fed1f406",
            "14bc6db53fa449049a6a63b422ee54bc",
            "70f9cecf2dca4a3badc609b10ffcc8ec",
            "cc082399aef94823b67d6e9c33d468e5",
            "b4fb273306e941ebaa529e781361acfb"
          ]
        },
        "id": "30147f7c",
        "outputId": "8b36524b-d22e-43f2-a6e0-cc4209d76736"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.\n",
            "🦥 Unsloth Zoo will now patch everything to make training faster!\n",
            "==((====))==  Unsloth 2024.11.9: Fast Pixtral vision patching. Transformers = 4.46.2.\n",
            "   \\\\   /|    GPU: Tesla T4. Max memory: 14.748 GB. Platform = Linux.\n",
            "O^O/ \\_/ \\    Pytorch: 2.5.1+cu121. CUDA = 7.5. CUDA Toolkit = 12.1.\n",
            "\\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.28.post3. FA2 = False]\n",
            " \"-____-\"     Free Apache license: http://github.com/unslothai/unsloth\n",
            "Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!\n"
          ]
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "8a0ce0af1025429cb3cfcc5b973a4af0",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "model.safetensors.index.json:   0%|          | 0.00/316k [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "d401dcde88274752b4f0786f834527d9",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "Downloading shards:   0%|          | 0/2 [00:00<?, ?it/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "eb64d7acd375493098e1f7c30e7f2781",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "model-00001-of-00002.safetensors:   0%|          | 0.00/4.97G [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "d4d3c9f1f54746d8895b66b6d2888ee2",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "model-00002-of-00002.safetensors:   0%|          | 0.00/3.57G [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "b60b17cb67044a82b173f94d781d5174",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "8f1faff5468a46d4a725d095ae31ea3c",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "generation_config.json:   0%|          | 0.00/133 [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "1ef7d525ad2b4a57ae1118a0c3908b1a",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "processor_config.json:   0%|          | 0.00/162 [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "fe28321b0bf04f58b0d63e7dc617e3ed",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "chat_template.json:   0%|          | 0.00/1.59k [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "ec91a47eece543a495f36ed0e051d637",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "preprocessor_config.json:   0%|          | 0.00/483 [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "db8a5ea365094bc4a87e77d059b4d0d0",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "tokenizer_config.json:   0%|          | 0.00/177k [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "1c7c3bfb5ae4472c9bb6e340c012d302",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "tokenizer.json:   0%|          | 0.00/17.1M [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "aa6b598305944d5ab2e2b4afaea8063c",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "special_tokens_map.json:   0%|          | 0.00/552 [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        }
      ],
      "source": [
        "from unsloth import FastVisionModel # FastLanguageModel for LLMs\n",
        "import torch\n",
        "\n",
        "# 4bit pre quantized models we support for 4x faster downloading + no OOMs.\n",
        "fourbit_models = [\n",
        "    \"unsloth/Llama-3.2-11B-Vision-Instruct-bnb-4bit\", # Llama 3.2 vision support\n",
        "    \"unsloth/Llama-3.2-11B-Vision-bnb-4bit\",\n",
        "    \"unsloth/Llama-3.2-90B-Vision-Instruct-bnb-4bit\", # Can fit in a 80GB card!\n",
        "    \"unsloth/Llama-3.2-90B-Vision-bnb-4bit\",\n",
        "\n",
        "    \"unsloth/Pixtral-12B-2409-bnb-4bit\",              # Pixtral fits in 16GB!\n",
        "    \"unsloth/Pixtral-12B-Base-2409-bnb-4bit\",         # Pixtral base model\n",
        "\n",
        "    \"unsloth/Qwen2-VL-2B-Instruct-bnb-4bit\",          # Qwen2 VL support\n",
        "    \"unsloth/Qwen2-VL-7B-Instruct-bnb-4bit\",\n",
        "    \"unsloth/Qwen2-VL-72B-Instruct-bnb-4bit\",\n",
        "\n",
        "    \"unsloth/llava-v1.6-mistral-7b-hf-bnb-4bit\",      # Any Llava variant works!\n",
        "    \"unsloth/llava-1.5-7b-hf-bnb-4bit\",\n",
        "] # More models at https://huggingface.co/unsloth\n",
        "\n",
        "model, tokenizer = FastVisionModel.from_pretrained(\n",
        "    \"unsloth/Pixtral-12B-2409\",\n",
        "    load_in_4bit = True, # Use 4bit to reduce memory use. False for 16bit LoRA.\n",
        "    use_gradient_checkpointing = \"unsloth\", # True or \"unsloth\" for long context\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "1c0cd2b0",
      "metadata": {
        "id": "1c0cd2b0"
      },
      "outputs": [],
      "source": [
        "model = FastVisionModel.get_peft_model(\n",
        "    model,\n",
        "    # We do NOT finetune vision & attention layers since Pixtral uses more memory!\n",
        "    finetune_vision_layers     = True, # False if not finetuning vision layers\n",
        "    finetune_language_layers   = True,  # False if not finetuning language layers\n",
        "    finetune_attention_modules = False, # False if not finetuning attention layers\n",
        "    finetune_mlp_modules       = True,  # False if not finetuning MLP layers\n",
        "\n",
        "    r = 8,           # The larger, the higher the accuracy, but might overfit\n",
        "    lora_alpha = 8,  # Recommended alpha == r at least\n",
        "    lora_dropout = 0,\n",
        "    bias = \"none\",\n",
        "    random_state = 3407,\n",
        "    use_rslora = False,  # We support rank stabilized LoRA\n",
        "    loftq_config = None, # And LoftQ\n",
        "    # target_modules = \"all-linear\", # Optional now! Can specify a list if needed\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "440fa965",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 177,
          "referenced_widgets": [
            "453a07e03ee94f528890205d14511da6",
            "309ebf7d588244a3ba28d29d56edbc2a",
            "dcc1ef7862e240eebefd5985202ef3b2",
            "0ef444f87c454640aa657c4c187e94f5",
            "d291b920db4b48fb9f3e89bc916db8c8",
            "3570f6aec06d461e9259a2bd60c6a16a",
            "3ed61e45df71452f844c1f39ea5c9dcf",
            "2b6b7fd9d5204198953a0f6fe8333d57",
            "e74c5834491b46d5b065665533671407",
            "fc9a5b89da594730a09a6bb4c3adb879",
            "22b0b4a4585148b2b6f75ecd86ebded3",
            "94a24760f2e3453aa615caf5300cd9da",
            "52e83aca1e544ffaabc57d149dd0415e",
            "b2e5c457d6a246cabcd71459cf2cee8d",
            "b8b672b219c347c4a636c1a43f73fda9",
            "76e1e2ae0a2944de8eff1c8eea325164",
            "6795172d2bb54298ae68cb481cf84822",
            "c09587db5ef3473490c1638c359ca945",
            "58e75f4e6c3e4ca9a6e6bc6e9865de67",
            "0fd6dae874a344678bac899cca709c3b",
            "da76163d3abd4ab9b4a7d02efdb15f00",
            "dc79ba710e9f4485a6f51d48963c6666",
            "9f45d79473ef446ca8f81c08ec82edda",
            "c381bddbbb6a4bbd99f90546dfb1867c",
            "9ca49f19650848dfbade44d036e166d2",
            "348f47ec75d44295b089a0071b747010",
            "d7601975774c43e4af927247019c2dab",
            "46513f9858a8403b93fc2fb1a0ea8779",
            "3d378819a2084a7797e8e8debf9bf815",
            "d2080299e24d40cbaecf75aead507415",
            "7d64c5f1dac64a49b794e723a5c77620",
            "08acfb1a242542d9990577d62118185f",
            "6cdfb3c8a1bc45b1ad34dd77feb4d4fe",
            "7856e0634b86431382280c591b6f740f",
            "cac2e90d5e3449978089a9fe4690cd93",
            "01838b1fb6da41e6bee2c632f9873b6e",
            "12b8a9e56ff646d4ba7032405183daf3",
            "2ef9f2c39bdf4893a4016e0880445e0f",
            "d358e59f602245fc96651c018347c448",
            "b7aeadfe764542e3ae4dd47e2664e410",
            "10523b38a16b4a48b095860653d34a6c",
            "ac9cb94447274626afc8580dd17260ad",
            "c6bc57fe79f045fabe94281cbe1bd09a",
            "05e888e211274b269794cde826c7d1f9",
            "6b2c1870e4ae44aba0326535f75c6858",
            "c3997e6823fb40349d9c92e32ce28045",
            "a4cf715dfb534feb814d426c2cac2b8e",
            "f57ddd4b011e428e82a86317d15779aa",
            "be6e313fb0b045adb8d8da6702081a0d",
            "dc07e208294f47d8b122afc8d415e3c8",
            "c2d88ba605ab4c2ab3f9686c5b65b8e8",
            "625723d424624ab2ac641b2c369da9e8",
            "35f6d9677b67401699e094040eaf8bd2",
            "ed9c8e6297654278bb360c13134920cb",
            "39d1ac1f430544c087b69e3bdb6ab5d6"
          ]
        },
        "id": "440fa965",
        "outputId": "54b0ee23-5573-407a-d22f-cda8378461da"
      },
      "outputs": [
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "453a07e03ee94f528890205d14511da6",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "README.md:   0%|          | 0.00/728 [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "94a24760f2e3453aa615caf5300cd9da",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "train-00000-of-00001.parquet:   0%|          | 0.00/357M [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "9f45d79473ef446ca8f81c08ec82edda",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "test-00000-of-00001.parquet:   0%|          | 0.00/57.6M [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "7856e0634b86431382280c591b6f740f",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "Generating train split:   0%|          | 0/8552 [00:00<?, ? examples/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "6b2c1870e4ae44aba0326535f75c6858",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "Generating test split:   0%|          | 0/1364 [00:00<?, ? examples/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        }
      ],
      "source": [
        "from datasets import load_dataset\n",
        "dataset = load_dataset(\"unsloth/llava-instruct-mix-vsft-mini\", split = \"train\")"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "b9670b46",
      "metadata": {
        "id": "b9670b46"
      },
      "source": [
        "## Output\n",
        "This shows a basic output example of the vision-language model."
      ]
    }
  ],
  "metadata": {
    "colab": {
      "provenance": []
    }
  },
  "nbformat": 4,
  "nbformat_minor": 5
}